fix: rope_scaling for vLLM and HuggingFace context extension #976
Open

dzorlu wants to merge 1 commit into NovaSky-AI:main from
Conversation
## vLLM Fix (`ray_wrapped_inference_engine.py`)

- Use `hf_overrides.rope_parameters` instead of the direct `rope_scaling` kwarg
- vLLM >= 0.8.3 requires the rope config via `hf_overrides`
- Convert OmegaConf `DictConfig` to a regular dict to avoid struct-mode errors
- Reference: https://docs.vllm.ai/en/latest/examples/offline_inference/context_extension/

## HuggingFace Fix (`model_wrapper.py`)

- Set `rope_scaling` on the model config object instead of passing it as a kwarg
- HuggingFace models don't accept `rope_scaling` as a `from_pretrained()` kwarg
- Fixes: `TypeError: Qwen3ForCausalLM.__init__() got an unexpected keyword argument 'rope_scaling'`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
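The vLLM side of the fix can be illustrated with a short sketch. The helper name `to_plain_dict`, the YaRN values, and the engine call are illustrative assumptions, not the PR's actual code; the key point is that rope parameters travel through `hf_overrides["rope_parameters"]` as a plain dict rather than as a top-level `rope_scaling` kwarg.

```python
# Sketch of the vLLM-side pattern: rope parameters go through
# hf_overrides["rope_parameters"] instead of a rope_scaling kwarg.
# Helper name and config values are illustrative.

def to_plain_dict(cfg):
    """Convert an OmegaConf DictConfig to a plain dict; pass dicts through.

    Struct-mode DictConfigs reject unknown keys, so vLLM needs a real dict.
    """
    try:
        from omegaconf import DictConfig, OmegaConf
        if isinstance(cfg, DictConfig):
            return OmegaConf.to_container(cfg, resolve=True)
    except ImportError:
        pass  # OmegaConf not installed; assume cfg is already dict-like
    return dict(cfg)


# Example YaRN config (illustrative values)
rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}

engine_kwargs = {
    "hf_overrides": {"rope_parameters": to_plain_dict(rope_scaling)},
}
# These kwargs would then be forwarded to the engine, e.g.:
# llm = vllm.LLM(model="Qwen/Qwen3-8B", **engine_kwargs)
```

Converting with `OmegaConf.to_container(..., resolve=True)` also resolves any interpolations before the dict reaches vLLM.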
Contributor
Code Review
The pull request resolves the `rope_scaling` TypeErrors in both the vLLM inference engines and HuggingFace FSDP training. The changes correctly adapt how `rope_scaling` and `rope_theta` are passed to match the updated requirements of vLLM (via `hf_overrides["rope_parameters"]`) and HuggingFace (set directly on the model config). The OmegaConf `DictConfig` conversion is handled robustly, preventing runtime errors. Overall, the fix is well-aligned with the problem description and cleanly enables context extension with YaRN.
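The HuggingFace side of the fix can be sketched as mutating the config object before model construction. `apply_rope_overrides` and `_DemoConfig` are hypothetical names for illustration; the real code would load the config with `AutoConfig.from_pretrained` and set the same attributes on it.

```python
# Sketch: set rope parameters on the model config object rather than
# passing them as from_pretrained() kwargs, which raises
# TypeError: ...__init__() got an unexpected keyword argument 'rope_scaling'.
# Helper and class names are illustrative.

def apply_rope_overrides(model_config, rope_scaling=None, rope_theta=None):
    """Mutate the config in place; from_pretrained(..., config=model_config)
    then picks the values up without the unexpected-kwarg TypeError."""
    if rope_scaling is not None:
        model_config.rope_scaling = dict(rope_scaling)  # ensure a plain dict
    if rope_theta is not None:
        model_config.rope_theta = rope_theta
    return model_config


class _DemoConfig:
    """Stand-in for transformers.PretrainedConfig in this sketch."""


cfg = apply_rope_overrides(
    _DemoConfig(),
    rope_scaling={"rope_type": "yarn", "factor": 4.0,
                  "original_max_position_embeddings": 32768},
    rope_theta=1000000.0,
)
# model = AutoModelForCausalLM.from_pretrained(model_name, config=cfg)
```

Routing everything through the config object keeps `from_pretrained()` calls free of architecture-specific kwargs.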
dzorlu pushed a commit to fleet-ai/SkyRL that referenced this pull request on Jan 27, 2026
paper.md:
- Simplified overview (no infrastructure details)
- Added per-environment breakdown tables for v0.1 and v0.2
- Removed Critical Issues section (moved to model_issues.md)
- Added empty results tables for held-out environments
- Reference model_issues.md for fixes

model_issues.md:
- Added v0.2.1 changelog entry for rope_scaling fix
- Links to upstream PR NovaSky-AI#976

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
dzorlu pushed a commit to fleet-ai/SkyRL that referenced this pull request on Feb 4, 2026
Summary
Fixes rope_scaling handling for both vLLM inference engines and HuggingFace FSDP training to enable YaRN context extension.
Problem
When using rope_scaling config (e.g., YaRN for context extension), training fails with:
```
TypeError: AsyncEngineArgs.__init__() got an unexpected keyword argument 'rope_scaling'
TypeError: Qwen3ForCausalLM.__init__() got an unexpected keyword argument 'rope_scaling'
```

Solution
vLLM Fix (`ray_wrapped_inference_engine.py`)
- vLLM no longer accepts a direct `rope_scaling` parameter
- Pass it via `hf_overrides["rope_parameters"]` instead

HuggingFace Fix (`model_wrapper.py`)
- HuggingFace models don't accept `rope_scaling` as a `from_pretrained()` kwarg
- Set it on the model config object instead

Files Changed
- `skyrl_train/inference_engines/ray_wrapped_inference_engine.py`
- `skyrl_train/model_wrapper.py`

Test Plan
🤖 Generated with Claude Code